Built by aligning high-quality genomes, saved as paths through the pangenome.
Human Pangenome Reference Consortium (HPRC)
Liao, Asri, Ebler, et al. Nature 2023
A snarl is a subgraph bounded by two node sides that are:
Separable: splitting the node into its two node sides separates a subgraph from the graph
Minimal: there are no nodes within the snarl that are separable with either boundary node side
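The separability condition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is a toy bidirected graph whose node sides are written as (node_id, "L") or (node_id, "R"), and the names opposite and is_separable are hypothetical helpers, not part of vg. The minimality condition is not checked here.

```python
from collections import defaultdict, deque

def opposite(side):
    """Return the other side of the same node."""
    node, end = side
    return (node, "R" if end == "L" else "L")

def is_separable(edges, x, y):
    """Check whether node sides x and y bound a separable subgraph.

    The two boundary nodes are "split": we never step from one side of a
    boundary node to its other side. Starting from x, the search must
    reach y without reaching the outward-facing sides of either boundary.
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    boundary_nodes = {x[0], y[0]}
    seen = {x}
    queue = deque([x])
    while queue:
        side = queue.popleft()
        neighbors = set(adj[side])
        # Inside a non-boundary node, a walk can pass to the other side.
        if side[0] not in boundary_nodes:
            neighbors.add(opposite(side))
        for nxt in neighbors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return y in seen and opposite(x) not in seen and opposite(y) not in seen

# Toy graph (assumed for illustration): 0 -> 1 -> {2 or 3} -> 4 -> 5
edges = [
    ((0, "R"), (1, "L")),
    ((1, "R"), (2, "L")),
    ((1, "R"), (3, "L")),
    ((2, "R"), (4, "L")),
    ((3, "R"), (4, "L")),
    ((4, "R"), (5, "L")),
]

print(is_separable(edges, (1, "R"), (4, "L")))  # the bubble between nodes 1 and 4
print(is_separable(edges, (1, "R"), (3, "L")))  # leaks out through node 2
```

In this toy graph, (0, "R") and (5, "L") also pass the separability check, but that pair would fail the minimality condition, which this sketch does not test.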
A run of consecutive snarls and nodes is called a chain
Snarls and chains can be nested inside of each other.
The nested relationship of snarls and chains is described by the snarl tree.
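One way to picture the snarl tree is as alternating levels of chains and snarls. The sketch below is a minimal illustration, assuming a simplified model in which a chain is a list of nodes and snarls, and a snarl holds its child chains. The class names Node, Snarl and Chain and the helpers depth and render are hypothetical and do not mirror the vg data structures.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Node:
    node_id: int

@dataclass
class Snarl:
    start: int                      # boundary node on one end
    end: int                        # boundary node on the other end
    chains: List["Chain"] = field(default_factory=list)

@dataclass
class Chain:
    children: List[Union[Node, Snarl]] = field(default_factory=list)

def depth(item):
    """Count how many levels of snarls are nested at or below item."""
    if isinstance(item, Node):
        return 0
    if isinstance(item, Chain):
        return max((depth(child) for child in item.children), default=0)
    return 1 + max((depth(chain) for chain in item.chains), default=0)

def render(item, indent=0):
    """Return the tree as indented text lines, one line per element."""
    pad = "  " * indent
    if isinstance(item, Node):
        return [f"{pad}node {item.node_id}"]
    if isinstance(item, Snarl):
        lines = [f"{pad}snarl {item.start}..{item.end}"]
        for chain in item.chains:
            lines += render(chain, indent + 1)
        return lines
    lines = [f"{pad}chain"]
    for child in item.children:
        lines += render(child, indent + 1)
    return lines

# Toy example: a snarl between nodes 2 and 5 nested inside a snarl between 1 and 6.
inner = Snarl(2, 5, chains=[Chain([Node(3)]), Chain([Node(4)])])
outer = Snarl(1, 6, chains=[Chain([Node(2), inner, Node(5)])])
root = Chain([Node(1), outer, Node(6)])

print("\n".join(render(root)))
print("nesting depth:", depth(root))
```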
Netgraphs are a representation of snarls with their child chains collapsed into a single node
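The collapsing step can be sketched as follows. This is a simplified illustration that ignores node orientation, so it does not capture everything a real netgraph records. The function name netgraph, the edge list and the chain label "chain_A" are all assumptions made for the example.

```python
def netgraph(edges, child_chains):
    """Collapse each child chain of a snarl into a single named node.

    edges: iterable of (from_node, to_node) pairs inside the snarl.
    child_chains: dict mapping a chain label to the set of nodes it contains.
    Returns the set of edges between collapsed nodes.
    """
    representative = {}
    for label, members in child_chains.items():
        for node in members:
            representative[node] = label
    collapsed = set()
    for a, b in edges:
        ra = representative.get(a, a)
        rb = representative.get(b, b)
        # Edges that stay inside one chain disappear in the netgraph.
        if ra != rb:
            collapsed.add((ra, rb))
    return collapsed

# Toy snarl bounded by nodes 1 and 6. Nodes 2..5 form one child chain
# (with its own nested bubble), and the edge 1 -> 6 skips the chain entirely.
snarl_edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (5, 6), (1, 6)]
chains = {"chain_A": {2, 3, 4, 5}}

net = netgraph(snarl_edges, chains)
print(sorted(net, key=str))
```

After collapsing, the snarl looks like a simple choice between passing through chain_A or skipping it, regardless of how complex the chain is inside.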
vg deconstruct
VCF + graph + decomposition for this graph
vg giraffe
Short reads
Long reads
Seeding with Minimizer Index
Zip code tree making with Distance Index
Chaining with Zip Code Trees
Alignment with GBWT/graph
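The seeding step can be illustrated with a toy minimizer index over a linear sequence. This is a sketch under simplifying assumptions: k-mers are ranked lexicographically rather than by hash, only one strand is considered, and the "reference" is a plain string instead of haplotype paths in a graph. The functions minimizers, build_index and find_seeds are hypothetical names, not vg giraffe internals.

```python
from collections import defaultdict

def minimizers(seq, k, w):
    """Return (position, kmer) for the smallest k-mer in each window of w k-mers."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Ties are broken by taking the leftmost position.
        best = min(window, key=lambda i: (kmers[i], i))
        picked.add(best)
    return [(i, kmers[i]) for i in sorted(picked)]

def build_index(reference, k, w):
    """Map each minimizer k-mer to the reference positions where it occurs."""
    index = defaultdict(list)
    for pos, kmer in minimizers(reference, k, w):
        index[kmer].append(pos)
    return index

def find_seeds(read, index, k, w):
    """Return (read_position, reference_position) pairs for shared minimizers."""
    seeds = []
    for read_pos, kmer in minimizers(read, k, w):
        for ref_pos in index.get(kmer, []):
            seeds.append((read_pos, ref_pos))
    return seeds

reference = "ACGTGATTACAGGT"
read = "GATTACA"
index = build_index(reference, k=3, w=2)
seeds = find_seeds(read, index, k=3, w=2)
print(seeds)
```

All seeds in this example share the same offset between read and reference position, which is the kind of consistency the later chaining step looks for.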
vg graph formats and indexes
Indexes
.gbwt (Graph Burrows-Wheeler Transform): haplotype paths
.gg (GBWT Graph): node sequences for a GBWT
.dist (Distance Index): snarl decomposition plus minimum distances
.zipcodes: per-node distance information used by vg giraffe
.min (Minimizer Index): minimizers used by vg giraffe
.gcsa (Generalized Compressed Suffix Array): substring index used by vg map and vg mpmap
Graphs
.gbz (GBWT + GG): the graph induced by the GBWT
.hg (/.vg) (HashGraph): graph format optimized for speed
.pg (/.vg) (PackedGraph): graph format optimized for space efficiency
.xg: older graph format
.vg: protobuf-based graph format
vg wiki
snarls paper doi: 10.1089/cmb.2017.0251
short read giraffe paper doi: 10.1126/science.abg8871
long read giraffe paper doi: 10.1101/2025.09.29.678807